Abstract

Continuous  time  Markov  chains  are  commonly  used  in  system 
performance  modeling.  Increasing  system  complexity  and  non- 
Markovian  behavior  can  drastically  increase  the  size  of  a  Markov 
model’s  state  space.  Accordingly,  approximation  techniques  have 
been  introduced  to  reduce  the  resources  needed  to  solve  Markov 
chain  models.  In  this  paper  we  discuss  a  method  for  automatically 
deriving  symbolic  solutions  of  Markov  chains.  Symbolic  solutions 
should  provide  insight  when  attempting  to  evaluate  the  validity  of 
both  Markov  models  and  approximation  techniques  for  their 
solution. 


1.  Introduction 

Continuous time Markov chains (CTMC) are commonly used tools in computer
systems modeling. CTMC have been used to model program behavior,1 system
performance,2,3 system reliability,4,5 and system availability,6 and also in the
combined evaluation of performance and reliability.7,8 Although the limitation of
exponentially distributed state occupancy times, as implied by a homogeneous CTMC,
appears to be restrictive, it is possible to use the Coxian method of stages to allow
arbitrary phase type distributions.3,9,10,11

In  general,  once  a  Markov  chain  model  of  a  system  has  been  constructed,  there 
are  several  solution  methods  available.  Figure  1  summarizes  these  methods  and 
typical  modeling  packages  that  employ  them.  The  Markov  model  of  a  system  can  be 
solved using integral equations, formally taking the convolution of the probability of
entering the state with the probability of remaining in it.12,13 Alternatively, the Markov
chain  can  be  converted  to  a  coupled  set  of  homogeneous  differential  equations.3  This 
set  of  equations  can  be  solved  using  either  numerical  techniques  or  Laplace 
transforms.  A  third  solution  method  is  simulation.  As  an  introduction,  we  will  briefly 
discuss  some  of  the  advantages  and  disadvantages  of  each  of  these  methods. 

Simulation  can  be  more  realistic  than  analytic  models.  Many  types  of  complex 
systems,  particularly  those  for  which  independence  assumptions  are  invalid,  can  be 
directly modeled using simulation. However, simulation models can have high
development  costs  and  may  require  large  amounts  of  computer  time  to  obtain 
statistically  significant  results.  Therefore,  the  low  cost  alternative  of  analytic  modeling 
can  be  attractive,  even  if  approximating  assumptions  are  necessary.  If  a  range  of 
systems  must  be  compared,  similar  systems  must  often  be  simulated  individually. 
Analytic  modeling  may  permit  the  comparison  of  such  systems  without  repeated 
simulation. 

The most common analytic approach is to represent the Markov chain as a set of
coupled differential equations. Each equation describes the flow "balance" conditions in
a corresponding state of the chain, i.e. the instantaneous rate of change in the
probability of being in a state is equal to the rate of arrival into the state less the rate
of departure from the state at that instant. This set of equations can be solved using
either Laplace transforms or numerical techniques. Using Laplace transforms for large
systems (either numerically or symbolically) may require finding the roots of many
large polynomials, a computationally expensive task. One advantage of numerical
techniques is that they can be easily extended to evaluate non-homogeneous Markov
chains. However, if the system is stiff, i.e. if two or more transition rates out of any
single state differ greatly in magnitude, special care will be needed to get an accurate
solution.
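As a sketch, the balance condition described above can be written out for a homogeneous chain as follows. The symbols are assumptions borrowed from the notation of section 2: $q_{ji}$ is the transition rate from state $j$ to state $i$, and $q_i$ is the total exit rate from state $i$.

```latex
\frac{dP_i(t)}{dt} \;=\; \sum_{j \neq i} q_{ji}\, P_j(t) \;-\; q_i\, P_i(t),
\qquad q_i = \sum_{j \neq i} q_{ij}
```

The first term is the rate of arrival into state $i$ and the second is the rate of departure from it.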


Sets of integral equations are similar to coupled differential equations.12 Integral
equations provide a basis for modeling both semi-Markov processes and
non-homogeneous Markov chains. Integral equations for Markov chain state probabilities
will also provide the basis for the closed form solution techniques discussed later in
this paper.

One issue we do not address in detail is the solution of cyclic Markov chains. In the
context of modeling fault-tolerant systems, we are restricting our attention to
non-repairable systems. Such systems can generally be represented by Markov chains
without cycles. If cycles are present in the model, all the solution methods discussed
are of diminished utility. The simulation of a cyclic Markov chain may be more
expensive (for the same degree of accuracy) than for an acyclic chain of the same size,
as the number of possible paths through the chain is no longer finite. If an analytic
solution of a cyclic Markov chain is desired, numerical solution of systems of either
differential or integral equations is usually the recommended approach.
Approximation techniques may still allow us to obtain a symbolic solution, albeit an
inexact one.

To illustrate the use of Markov models, we consider an example from reliability
modeling.14 Figure 2 depicts the transition diagram of a Markov chain representing a
3-component parallel redundant system. The individual components have lifetimes
that are independent and exponentially distributed with parameter $\lambda$. When an
individual component fails, a reconfiguration process with rate parameter $\delta$ begins.
This process is guaranteed to reconfigure the system as long as a second fault does not
occur before the reconfiguration is completed. The reliability of the system at time $t$,
denoted $R(t)$, is given by $1 - P\{\text{process in state } F_1 \text{ or } F_2 \text{ at time } t\}$.

A package employing traditional (i.e. numerical) solution techniques would input
numeric values for the parameters of the model (here $\delta$ and $\lambda$) and would solve the
system numerically for $t$ less than some fixed value. To determine the actual behavior
of the system as a function of a parameter other than $t$ would require many runs of
such a program.

In this paper, we discuss a method for the derivation of state probabilities of an
acyclic Markov chain in a symbolic fashion. Closed form results (that previously could
be obtained only by hand) give greater insight into actual system behavior by allowing
us to easily study the relationship between input parameters and the resulting state
probability distributions. Our approach is based on the use of integral equations. It is
computationally similar to using Laplace transforms to solve systems of coupled ODEs.
The program implementing the algorithm discussed in the paper is called ACE (Acyclic
Markov Chain Evaluator). The solution of cyclic chains (which presents additional
difficulties) is omitted from our discussion.

In section 2, we describe our method, which is partially inspired by the program
SPADE.15 In section 3 we describe our program's implementation. Some examples of
the use of ACE are given in section 4.


Consider an acyclic continuous time Markov chain. Let the states be numbered
$1, 2, \ldots, N$. Let $\lambda_1, \ldots, \lambda_K$ be the transition rate variables. The transition rate from state
$i$ to state $j$ is denoted by $q_{ij}$, where $q_{ij}$ can be expressed as a linear sum of transition
rate variables, i.e.

$$q_{ij} = \sum_{k=1}^{K} c(ij)_k \lambda_k, \qquad c(ij)_k \in (-\infty, \infty).$$

Further let $q_i$ denote the total exit rate from state $i$:

$$q_i = \sum_{j} q_{ij}. \tag{1}$$


For any state $i$ of an acyclic Markov chain, let $P_i(t)$ be the probability that the
system is in state $i$ at time $t$. For any state $i$, $P_i(t)$ may be written as an expression of
the form

$$P_i(t) = \sum_{j} e^{\gamma_j t} \Big[ \sum_{k} a_{jk} t^k \Big]. \tag{2}$$

The fact that the state probability distributions are of this form is easily derived. First
observe that the initial state(s) probability(ies) has (have) this form. The probability of
being in (non-initial) state $i$ at time $t$ can be written as:

$$P_i(t) = \int_0^t \sum_{j \in J(i)} q_{ji}\, P_j(x)\, e^{-q_i (t-x)} \, dx \tag{3}$$

where $J(i)$ is the set of states with a transition leading to state $i$. By induction, it is
easy to show that, if every $P_j(x)$ has the form (2), then so does any $P_i(x)$ derived using
(3).


We now derive the equations needed to calculate the constants of equation (2) for
any state of an actual acyclic Markov chain. Let $S(i)$ be the set of poles of the
Laplace-Stieltjes transform of $P_i(t)$, i.e. the set of the $\gamma_j$'s of (2). Let us rename
$\gamma^* \triangleq -q_i$ and define $S(J(i)) \triangleq \bigcup_{j \in J(i)} S(j)$. Setting $N(j) \triangleq |S(j)|$, we can write
$S(j) = \{\gamma_{j1}, \gamma_{j2}, \ldots, \gamma_{jN(j)}\}$. If we number the poles then $P_j(t)$ may be written as:

$$P_j(t) = \sum_{l=1}^{N(j)} e^{\gamma_{jl} t} \Big[ \sum_{k=0}^{L(j,l)} a_{jlk} t^k \Big] \tag{4}$$

$L(j,l)$ is the maximum power of $t$ associated with pole $\gamma_{jl}$ in $P_j(t)$ such that $a_{jlL(j,l)} \neq 0$.

If $N(j) > 1$, it is easy to show using an inductive proof that, for any pole $\gamma_{jl}$,

$$(a_{jlL(j,l)} \neq 0) \Rightarrow \forall k < L(j,l)\ (a_{jlk} \neq 0) \tag{5}$$

If $N(j) = 1$, then $P_j(t)$ is of the form $a t^k e^{\gamma t}$. This corresponds to the case where there is
only one directed path from the original state to state $j$ and all the transition rates
along this path are equal, i.e. $q_j = -\gamma$. Only in this case is the implication in (5) not
satisfied. Thus $P_i(t)$ may be written

$$P_i(t) = \sum_{j \in J(i)} q_{ji} \int_0^t e^{\gamma^*(t-x)} \Big[ \sum_{l=1}^{N(j)} e^{\gamma_{jl} x} \sum_{k=0}^{L(j,l)} a_{jlk} x^k \Big] dx. \tag{6}$$

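But, especially when the $P_j(t)$'s have common poles, i.e. when $|S(J(i))| < \sum_{j \in J(i)} |S(j)|$,
this expression reduces to

$$P_i(t) = \int_0^t e^{\gamma^*(t-x)} \sum_{\gamma \in S(J(i))} e^{\gamma x} \Big[ \sum_{j \in J(i)} \sum_{l=1}^{N(j)} 1(\gamma_{jl} = \gamma)\, q_{ji} \sum_{k=0}^{L(j,l)} a_{jlk} x^k \Big] dx
= \sum_{\gamma \in S(J(i))} \int_0^t e^{\gamma^*(t-x)} e^{\gamma x} \sum_{k=0}^{L(\gamma)} a_{\gamma k} x^k \, dx \tag{7}$$

where

$$a_{\gamma k} \triangleq \sum_{j \in J(i)} \sum_{l=1}^{N(j)} 1(\gamma_{jl} = \gamma)\, q_{ji}\, a_{jlk}, \qquad \gamma \in S(J(i)) \tag{8}$$

$$L(\gamma) = \max \{ L(j,l) : j \in J(i),\ \gamma_{jl} = \gamma \}. \tag{9}$$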

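Moving the integral inside the summation we obtain

$$P_i(t) = \sum_{\gamma \in S(J(i))} \sum_{k=0}^{L(\gamma)} a_{\gamma k}\, e^{\gamma^* t} \int_0^t x^k e^{(\gamma - \gamma^*) x} \, dx. \tag{10}$$

The resolution of this integral depends on whether $\gamma$ differs from $\gamma^*$. If $\gamma = \gamma^*$, then

$$\int_0^t x^k e^{(\gamma - \gamma^*) x} \, dx = \int_0^t x^k \, dx = \frac{t^{k+1}}{k+1}. \tag{11}$$

Otherwise, when $s = (\gamma - \gamma^*)$, integration by parts yields

$$\int_0^t x^k e^{s x} \, dx = e^{s t} \Big[ \sum_{u=0}^{k} (-1)^u \frac{k!}{(k-u)!} \frac{t^{k-u}}{s^{u+1}} \Big] - (-1)^k \frac{k!}{s^{k+1}}. \tag{12}$$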
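The integration-by-parts result can be checked numerically. The following is a minimal sketch, not part of ACE: the function names `closed_form` and `simpson` are hypothetical, and the closed form coded here is $\int_0^t x^k e^{sx}\,dx = e^{st}\sum_{u=0}^{k}(-1)^u \frac{k!}{(k-u)!}\frac{t^{k-u}}{s^{u+1}} - (-1)^k \frac{k!}{s^{k+1}}$ for $s \neq 0$, compared against a composite Simpson rule.

```python
import math


def closed_form(k, s, t):
    """Integral of x**k * exp(s*x) over [0, t] via the integration-by-parts sum (s != 0)."""
    total = 0.0
    for u in range(k + 1):
        total += ((-1) ** u * math.factorial(k) / math.factorial(k - u)
                  * t ** (k - u) / s ** (u + 1))
    # The second term is the antiderivative evaluated at the lower limit x = 0.
    return math.exp(s * t) * total - (-1) ** k * math.factorial(k) / s ** (k + 1)


def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    acc = f(a) + f(b)
    for i in range(1, n):
        acc += (4 if i % 2 else 2) * f(a + i * h)
    return acc * h / 3


if __name__ == "__main__":
    k, s, t = 2, -1.5, 0.8
    numeric = simpson(lambda x: x ** k * math.exp(s * x), 0.0, t)
    print(closed_form(k, s, t), numeric)
```

For $k = 0$ the closed form collapses to $(e^{st} - 1)/s$, which is a quick sanity check on the sign of the constant term.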


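We can now write the equation for $P_i(t)$:

$$P_i(t) = \sum_{\substack{\gamma \in S(J(i)) \\ \gamma \neq \gamma^*}} \sum_{k=0}^{L(\gamma)} a_{\gamma k} \Big[ e^{\gamma t} \sum_{u=0}^{k} (-1)^u \frac{k!}{(k-u)!} \frac{t^{k-u}}{(\gamma - \gamma^*)^{u+1}} + (-1)^{k+1} \frac{k!}{(\gamma - \gamma^*)^{k+1}} e^{\gamma^* t} \Big]
+ 1(\gamma^* \in S(J(i))) \sum_{k=0}^{L(\gamma^*)} a_{\gamma^* k} \frac{t^{k+1}}{k+1} e^{\gamma^* t}. \tag{13}$$

We note that

$$S(i) = \{\gamma^*\} \cup S(J(i)) \tag{14}$$

and define

$$L(i,l) = \begin{cases} L(\gamma_l) & \text{if } \gamma_l \neq \gamma^* \\ L(\gamma^*) + 1 & \text{if } \gamma_l = \gamma^* \text{ and } \gamma^* \in S(J(i)) \\ 0 & \text{if } \gamma_l = \gamma^* \text{ and } \gamma^* \notin S(J(i)) \end{cases} \tag{15}$$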

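From (13) note that, if $u = k - v$, we have

$$\sum_{k=0}^{L(\gamma)} a_{\gamma k} \sum_{u=0}^{k} (-1)^u \frac{k!}{(k-u)!} \frac{t^{k-u}}{(\gamma - \gamma^*)^{u+1}}
= \sum_{v=0}^{L(\gamma)} t^v \sum_{k=v}^{L(\gamma)} (-1)^{k-v} a_{\gamma k} \frac{k!}{v!} \frac{1}{(\gamma - \gamma^*)^{k-v+1}}.$$

If we write $P_i(t)$ as

$$P_i(t) = \sum_{l=1}^{N(i)} e^{\gamma_l t} \Big[ \sum_{k=0}^{L(i,l)} a'_{\gamma_l k} t^k \Big], \tag{16}$$

then, if $\gamma_l \neq \gamma^*$, we have

$$a'_{\gamma_l k} = \sum_{v=k}^{L(\gamma_l)} (-1)^{v-k} a_{\gamma_l v} \frac{v!}{k!} \frac{1}{(\gamma_l - \gamma^*)^{v-k+1}}, \qquad k = 0, \ldots, L(\gamma_l). \tag{17}$$

If $\gamma^* \notin S(J(i))$ then

$$a'_{\gamma^* 0} = \sum_{\gamma \in S(J(i))} \sum_{k=0}^{L(\gamma)} (-1)^{k+1} a_{\gamma k} \frac{k!}{(\gamma - \gamma^*)^{k+1}}. \tag{18}$$

Otherwise, if $\gamma^* \in S(J(i))$, the constant $a'_{\gamma^* 0}$ is the sum in (18) restricted to $\gamma \neq \gamma^*$, and

$$a'_{\gamma^*, k+1} = \frac{a_{\gamma^* k}}{k+1}, \qquad k = 0, \ldots, L(\gamma^*). \tag{19}$$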

With these equations, we can easily compute the coefficients of the polynomials in $t$
that multiply the exponentials in the state probability expressions.
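As a worked illustration, consider a hypothetical three-state chain $1 \to 2 \to 3$ with rates $\lambda_1$ and $\lambda_2$ (this chain and its symbols are assumptions introduced for the example, not taken from the figures). State 1 is the initial state, so $P_1(t) = e^{-\lambda_1 t}$. For state 2 the new pole is $\gamma^* = -\lambda_2$, the only incoming pole is $\gamma = -\lambda_1$, and the incoming coefficient is $a_{\gamma 0} = \lambda_1$.

```latex
% Case 1: \lambda_1 \neq \lambda_2, so \gamma^* is not an incoming pole.
a'_{\gamma 0} = \frac{a_{\gamma 0}}{\gamma - \gamma^*} = \frac{\lambda_1}{\lambda_2 - \lambda_1},
\qquad
a'_{\gamma^* 0} = -\frac{a_{\gamma 0}}{\gamma - \gamma^*} = -\frac{\lambda_1}{\lambda_2 - \lambda_1},
\qquad
P_2(t) = \frac{\lambda_1}{\lambda_2 - \lambda_1}\left(e^{-\lambda_1 t} - e^{-\lambda_2 t}\right).

% Case 2: \lambda_1 = \lambda_2 = \lambda, so \gamma^* is an incoming pole
% and the degree of its polynomial is incremented.
a'_{\gamma^* 0} = 0,
\qquad
a'_{\gamma^* 1} = \frac{a_{\gamma^* 0}}{1} = \lambda,
\qquad
P_2(t) = \lambda\, t\, e^{-\lambda t}.
```

Both results agree with the familiar hypoexponential and Erlang forms, which is a useful check on the signs in the coefficient formulas.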


3. Implementation

In this section we outline the procedure used by the ACE program to compute
state probability expressions for an acyclic Markov chain. Techniques for chains with
cycles  are  more  complex,  requiring  either  the  symbolic  solution  of  a  set  of  equations  or 
some  form  of  approximation.  We  also  briefly  discuss  the  operation  and  user  interface 
of  ACE. 

The ACE procedure is detailed in the Appendix; we briefly outline it here. First the
states are sorted according to the partial order induced by the transitions. For state
$i$, the probability expression is computed by first determining all the poles of states
that have transitions leading to state $i$. If the pole associated with state $i$'s outgoing
transition is not in the incoming set, coefficients for the polynomials multiplying the
incoming exponential terms are computed using equation (17). The new pole's
polynomial multiplier is a constant computed using equation (18). If the outgoing pole
is also in the set of incoming poles, the degree of its polynomial will be incremented.
The new coefficients for the incremented polynomial can be computed using equation
(19).
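The outline above can be sketched in Python for the case where all rates are numeric. This is an illustrative reimplementation under stated assumptions, not the ACE source: the function name `solve_acyclic`, the dictionary layout, and the use of exact `Fraction` arithmetic (so that equal poles compare equal) are all choices made for this example. Each state's result maps a pole $\gamma$ to the coefficient list of the polynomial in $t$ that multiplies $e^{\gamma t}$.

```python
from fractions import Fraction
from math import factorial


def solve_acyclic(rates, initial, order):
    """rates: {(i, j): q_ij}; order: states sorted so ancestors come first.

    Returns {state: {pole: [a_0, a_1, ...]}}, meaning
    P_state(t) = sum over poles of exp(pole * t) * sum_k a_k * t**k.
    """
    exit_rate = {s: sum(q for (i, _), q in rates.items() if i == s) for s in order}
    P = {}
    for s in order:
        g_star = -exit_rate[s]                 # pole contributed by state s itself
        if s == initial:
            P[s] = {g_star: [Fraction(1)]}
            continue
        # Merge incoming terms per pole: a_{gamma,k} = sum_j q_js * a_{j,gamma,k}.
        incoming = {}
        for (j, i), q in rates.items():
            if i != s:
                continue
            for g, coeffs in P[j].items():
                acc = incoming.setdefault(g, [])
                acc.extend([Fraction(0)] * (len(coeffs) - len(acc)))
                for k, a in enumerate(coeffs):
                    acc[k] += q * a
        new, const = {}, Fraction(0)
        for g, coeffs in incoming.items():
            if g == g_star:
                continue
            d, top = g - g_star, len(coeffs) - 1
            # Coefficients of the incoming exponentials (the role of equation (17)).
            new[g] = [sum((-1) ** (v - k) * coeffs[v] * Fraction(factorial(v), factorial(k))
                          / d ** (v - k + 1) for v in range(k, top + 1))
                      for k in range(top + 1)]
            # Constant multiplying exp(g_star * t) (the role of equation (18)).
            for k, a in enumerate(coeffs):
                const += (-1) ** (k + 1) * factorial(k) * a / d ** (k + 1)
        star = [const]
        if g_star in incoming:
            # Repeated pole: raise the polynomial degree (the role of equation (19)).
            star += [a / (k + 1) for k, a in enumerate(incoming[g_star])]
        new[g_star] = star
        P[s] = new
    return P


if __name__ == "__main__":
    lam, delta = 1, 4   # illustrative numeric rates
    chain = {(3, "3x"): 3 * lam, ("3x", 2): delta, ("3x", "Fc"): 2 * lam,
             (2, "2x"): 2 * lam, ("2x", 1): delta, ("2x", "Fc"): lam, (1, "Fe"): lam}
    result = solve_acyclic(chain, 3, [3, "3x", 2, "2x", 1, "Fe", "Fc"])
    print(result["Fe"][0], result["Fc"][0])
```

The demo chain is the 3-component structure implied by the text of section 4.1 (an assumption, since the transition diagram itself is not reproduced here); the state labels `"3x"`, `"2x"`, `"Fe"`, and `"Fc"` are hypothetical names for the two reconfiguration states and the two failure states.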

ACE  is  being  developed  as  the  first  stage  of  a  testbed  for  aggregation  techniques. 
Two  versions  are  currently  being  implemented.  The  first  version  supports  an  unlimited 
number  of  symbolic  variables  but  generates  answers  that  are  symbolic  only  in  the 
poles  (the  powers  of  the  exponential  terms).  The  coefficients  of  the  polynomials  in  t 
that  multiply  the  exponentials  are  numeric.  The  second  version  of  ACE  is  fully 
symbolic in one variable and numeric in other variables, i.e. the coefficients of the
polynomials  in  t  that  multiply  the  exponentials  have  both  numeric  and  symbolic  parts. 
This  allows  us  to  conduct  a  parametric  sensitivity  analysis  in  a  fully  symbolic  fashion. 
Eventually  these  two  methods  will  be  combined  yielding  completely  symbolic  poles  and 
coefficients  that  are  fully  symbolic  in  at  least  one  variable. 


Several problems have arisen in constructing the ACE package. When computing
symbolic coefficients, the size of the coefficients grows linearly with the number of
symbolic variables used along all paths to the state. The coefficients rapidly reach an
unmanageable size, even for a small chain. Restricting the lengths of the paths
through the chain would greatly reduce the package's utility, particularly for chains
that are "long" (e.g. simple death processes). Instead, we restrict the number of
symbolic variables that are maintained in a given run of the program. All variables not
treated symbolically are merged numerically. If poles are still maintained in a fully
symbolic fashion, care must be taken to correctly merge the numeric values of
symbolically different, numerically identical poles.
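One way the merging step might look is sketched below. This is an assumption about a possible approach, not a description of what ACE does: the function name `merge_poles`, the tolerance parameter, and the list-of-pairs representation are all hypothetical. Terms whose poles evaluate to the same number (within a tolerance) have their polynomial coefficients added together.

```python
def merge_poles(terms, rel_tol=1e-9):
    """Combine terms whose poles agree numerically.

    terms: list of (pole_value, coeffs) pairs, where coeffs[k] multiplies t**k.
    Returns a new list of (pole_value, coeffs) with near-equal poles merged.
    """
    merged = []
    for pole, coeffs in sorted(terms, key=lambda pc: pc[0]):
        if merged and abs(pole - merged[-1][0]) <= rel_tol * max(1.0, abs(pole)):
            acc = merged[-1][1]
            # Pad the shorter polynomial with zeros before adding coefficients.
            acc.extend([0.0] * (len(coeffs) - len(acc)))
            for k, a in enumerate(coeffs):
                acc[k] += a
        else:
            merged.append((pole, list(coeffs)))
    return merged


if __name__ == "__main__":
    lam, delta = 1.0, 2.0
    # -(lam + delta) and -3*lam are symbolically different but numerically equal here.
    terms = [(-(lam + delta), [1.0]), (-3 * lam, [0.5, 2.0]), (-lam, [4.0])]
    print(merge_poles(terms))
```

A tolerance-based comparison is used because floating-point evaluation of two different symbolic expressions need not produce bit-identical values.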

Further efforts include the construction of a "user-friendly" interface and the
addition  of  a  block  definition  and  solution  facility.  The  user  will  be  able  to  define 
blocks  of  states  with  fixed  entry  and  exit  points.  The  blocks  could  be  evaluated  by 
direct  insertion  of  their  states  into  the  chain.  Alternatively,  the  block  could  be  solved 
in  isolation  using  symbolic  or  numerical  approximation  methods.  This  capability 
should  further  facilitate  the  use  of  ACE  in  evaluating  aggregation  methods. 

4.  Examples  and  Conclusions 

In  this  section  we  demonstrate  the  use  of  ACE-like  symbolic  computation.  We 
begin  by  symbolically  solving  the  example  given  in  figure  2  using  the  method  described 
in sections 2 and 3. We then apply a simple aggregation technique to the chain and
re-solve the system. We give examples that demonstrate the utility of a symbolic solution
for  bounding,  sensitivity  analysis,  and  comparison  of  aggregation  techniques. 

4.1  Exact  Solution  of  3-Component  System 

Given the chain shown in Figure 2, we follow the algorithm outline given in section
3. We first observe that the only $\gamma$ for the probability distribution of state 3 is $-3\lambda$. As
state 3 has no parent, the constant $a'_{-3\lambda,0} = 1$. Accordingly we can write:

$$P_3(t) = e^{-3\lambda t}. \tag{20}$$

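We can continue following the algorithm sketch and derive the following equations for
the functioning states:

$$P_{3x}(t) = -\frac{3\lambda}{\lambda-\delta} e^{-3\lambda t} + \frac{3\lambda}{\lambda-\delta} e^{-(2\lambda+\delta)t} \tag{21}$$

$$P_2(t) = \frac{3\delta}{\lambda-\delta} e^{-3\lambda t} - \frac{3\lambda}{\lambda-\delta} e^{-(2\lambda+\delta)t} + 3 e^{-2\lambda t} \tag{22}$$

$$P_{2x}(t) = -\frac{6\lambda\delta}{(\lambda-\delta)(2\lambda-\delta)} e^{-3\lambda t} + \frac{6\lambda}{\lambda-\delta} e^{-(2\lambda+\delta)t} - \frac{6\lambda}{\lambda-\delta} e^{-2\lambda t} + \frac{6\lambda\delta}{(\lambda-\delta)(2\lambda-\delta)} e^{-(\lambda+\delta)t} \tag{23}$$

$$P_1(t) = \frac{3\delta^2}{(\lambda-\delta)(2\lambda-\delta)} e^{-3\lambda t} - \frac{6\lambda\delta}{(\lambda-\delta)(\lambda+\delta)} e^{-(2\lambda+\delta)t} + \frac{6\delta}{\lambda-\delta} e^{-2\lambda t} - \frac{6\lambda\delta}{(\lambda-\delta)(2\lambda-\delta)} e^{-(\lambda+\delta)t} + \frac{3\delta}{\lambda+\delta} e^{-\lambda t} \tag{24}$$

For the state that corresponds to failure due to exhaustion of components we can
write:

$$P_{F_1}(t) = -\frac{\delta^2}{(\lambda-\delta)(2\lambda-\delta)} e^{-3\lambda t} + \frac{6\lambda^2\delta}{(\lambda-\delta)(\lambda+\delta)(2\lambda+\delta)} e^{-(2\lambda+\delta)t} - \frac{3\delta}{\lambda-\delta} e^{-2\lambda t} + \frac{6\lambda^2\delta}{(\lambda-\delta)(\lambda+\delta)(2\lambda-\delta)} e^{-(\lambda+\delta)t} - \frac{3\delta}{\lambda+\delta} e^{-\lambda t} + \frac{\delta^2}{(2\lambda+\delta)(\lambda+\delta)} \tag{25}$$

For the state that corresponds to a coverage failure we can write:

$$P_{F_2}(t) = \Big( \frac{2\lambda}{\lambda-\delta} + \frac{2\lambda\delta}{(\lambda-\delta)(2\lambda-\delta)} \Big) e^{-3\lambda t} - \frac{12\lambda^2}{(\lambda-\delta)(2\lambda+\delta)} e^{-(2\lambda+\delta)t} + \frac{3\lambda}{\lambda-\delta} e^{-2\lambda t} - \frac{6\lambda^2\delta}{(\lambda-\delta)(\lambda+\delta)(2\lambda-\delta)} e^{-(\lambda+\delta)t} + \frac{2\lambda}{2\lambda+\delta} + \frac{\lambda\delta}{(2\lambda+\delta)(\lambda+\delta)} \tag{26}$$

We note that the reliability of the 3-component system is given by

$$R(t) = 1 - (P_{F_1}(t) + P_{F_2}(t)). \tag{27}$$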

To derive the information contained in this symbolic reliability expression, even a
highly flexible conventional reliability evaluation package would require several runs
for different parameter values. For example, to see the effect of the fault-handling
rate we consider the reliability expression as a function of $\delta$. Fixing $\lambda = 10^{-4}$
failures/hour and $t = 10$ hours, in Figure 3 we graph $-\log_{10}$ of the system unreliability as
a function of $-\log_{10}$ of the mean time to handle a fault. When faults are handled
quickly (in this case about 100 ms), we see that the reliability of the system
approaches that of a system with perfect coverage. When fault-handling is slow
(minutes or hours), imperfect coverage dramatically reduces the system's reliability.
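A parameter sweep of this kind can be sketched directly from the closed form. The code below is an illustration under stated assumptions: the function name `exact_unreliability` is hypothetical, and the coefficients were re-derived from the chain structure implied by the text (states 3, 3x, 2, 2x, 1 and two failure states), since the transition diagram is not reproduced here. The `decimal` module is used with extra precision because for small $\lambda t$ the terms, each of order one, nearly cancel.

```python
from decimal import Decimal, getcontext

getcontext().prec = 50  # extra digits: O(1) terms cancel down to very small values


def exact_unreliability(lam, delta, t):
    """P_F1(t) + P_F2(t) for the 3-component chain.

    Assumes delta differs from lam and from 2*lam (otherwise poles coincide
    and these simple-pole formulas divide by zero).
    """
    lam, delta, t = Decimal(lam), Decimal(delta), Decimal(t)
    a, b = lam - delta, 2 * lam - delta       # differences appearing in denominators
    c, d = lam + delta, 2 * lam + delta       # exit rates of states 2x and 3x

    def e(rate):
        return (-rate * t).exp()

    p_exhaust = (-delta ** 2 / (a * b) * e(3 * lam)
                 + 6 * lam ** 2 * delta / (a * c * d) * e(d)
                 - 3 * delta / a * e(2 * lam)
                 + 6 * lam ** 2 * delta / (a * c * b) * e(c)
                 - 3 * delta / c * e(lam)
                 + delta ** 2 / (d * c))
    p_coverage = ((2 * lam / a + 2 * lam * delta / (a * b)) * e(3 * lam)
                  - 12 * lam ** 2 / (a * d) * e(d)
                  + 3 * lam / a * e(2 * lam)
                  - 6 * lam ** 2 * delta / (a * c * b) * e(c)
                  + 2 * lam / d + lam * delta / (d * c))
    return p_exhaust + p_coverage


if __name__ == "__main__":
    # Sweep the fault-handling rate (per hour) at lam = 1e-4 per hour, t = 10 hours.
    for delta in ("1", "60", "3600", "36000"):
        u = exact_unreliability("1e-4", delta, "10")
        print(delta, -u.log10())
```

Each evaluation is a handful of exponentials, so sweeping $\delta$ (or any other parameter) costs far less than re-running a numerical solver for every parameter value.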

4.2  An  Approximate  Solution  of  the  3-Component  System 

When  Markov  models  are  used  for  realistic  systems,  the  state  space  often  grows 
beyond  practical  limits.  Accordingly,  reliability  evaluation  packages  often  use  various 
aggregation  or  lumping  methods  to  reduce  the  size  of  the  state  space.  For  example,  a 
system model can be decomposed into sub-models of fault-handling and fault-occurrence
behavior.10 Short of assuming all faults are successfully handled, one of the
simplest  approaches  is  to  condense  the  second  fault  rate  and  fault-handling 
parameters  into  a  single  constant  c  denoting  coverage,  the  probability  that  an 
arbitrary  fault  is  successfully  handled.  When  this  approach  is  used  with  our  example, 
we  obtain  the  chain  shown  in  Figure  4.  Its  state  probability  equations  are 


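$$P_3(t) = e^{-3\lambda t} \tag{28}$$

$$P_2(t) = -3c_1 e^{-3\lambda t} + 3c_1 e^{-2\lambda t} \tag{29}$$

$$P_1(t) = 3c_1 c_2 e^{-3\lambda t} - 6c_1 c_2 e^{-2\lambda t} + 3c_1 c_2 e^{-\lambda t} \tag{30}$$

$$P_{F_1}(t) = -c_1 c_2 e^{-3\lambda t} + 3c_1 c_2 e^{-2\lambda t} - 3c_1 c_2 e^{-\lambda t} + c_1 c_2 \tag{31}$$

$$P_{F_2}(t) = \big(2(1-c_2)c_1 - (1-c_1)\big) e^{-3\lambda t} - 3c_1(1-c_2) e^{-2\lambda t} + (1-c_1) + c_1(1-c_2) \tag{32}$$

As in the original chain,

$$R(t) = 1 - (P_{F_1}(t) + P_{F_2}(t)).$$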

One interesting problem is correctly choosing values for the $c$ parameter. If two $c$
parameters were employed, as in our example, the instantaneous coverage
approximation would usually be

$$c_1 = \frac{\delta}{2\lambda + \delta} \tag{33}$$

$$c_2 = \frac{\delta}{\lambda + \delta} \tag{34}$$

However, if only a single coverage parameter were chosen, the choice might well
depend on the period of time over which we were interested in evaluating the reliability
of the system.

With the reliability expressions for both the aggregated and original Markov
chains, we can evaluate the acceptability of the aggregation scheme by comparing the
results they produce. Figure 5 shows the graphs of three estimates of $-\log_{10}$ of system
unreliability as a function of $\log_{10}$ of $t$. With $\lambda$ fixed at $10^{-4}$ failures/hr. for all three
curves, the lower curve is derived by solving the original Markov model in Figure 2 with
$\delta = 1$. Using this $\delta$ value, the middle curve is derived using the aggregated chain in
Figure 4 and instantaneous coverage estimates derived from equations (33) and (34).
The upper curve is derived using a naive perfect coverage model, i.e. fault handling is
assumed to always succeed instantaneously. Even for this contrived situation ($\lambda$ and $\delta$
are probably much closer in value than they would be in practice), we see that a
constant coverage assumption still can provide a good estimate of system reliability.
For our particular example, if a more realistic $\delta$ value is chosen, the reliability
estimates provided by the original and aggregated chains are essentially identical.
Extending validation of a simple approximation scheme for a small model to more
realistic models may require significant effort.
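The aggregated-chain estimate and the perfect-coverage estimate can be compared with a short sketch. The function names below are hypothetical, and the coverage formulas $c_1 = \delta/(2\lambda+\delta)$ and $c_2 = \delta/(\lambda+\delta)$ are an assumption: they are the probabilities that fault handling finishes before a second fault, chosen so that the limiting failure probabilities of the aggregated chain match those of the original chain.

```python
from decimal import Decimal, getcontext

getcontext().prec = 50  # the exponential terms nearly cancel when lam * t is small


def instantaneous_coverage(lam, delta):
    """Assumed coverage constants: handling completes before the next fault."""
    lam, delta = Decimal(lam), Decimal(delta)
    return delta / (2 * lam + delta), delta / (lam + delta)


def aggregated_unreliability(lam, c1, c2, t):
    """P_F1(t) + P_F2(t) for the aggregated chain with coverages c1 and c2."""
    lam, c1, c2, t = (Decimal(v) for v in (lam, c1, c2, t))
    e1, e2, e3 = ((-n * lam * t).exp() for n in (1, 2, 3))
    p_exhaust = c1 * c2 * (1 - 3 * e1 + 3 * e2 - e3)
    p_coverage = ((2 * (1 - c2) * c1 - (1 - c1)) * e3
                  - 3 * c1 * (1 - c2) * e2
                  + (1 - c1) + c1 * (1 - c2))
    return p_exhaust + p_coverage


if __name__ == "__main__":
    lam, delta = "1e-4", "1"
    c1, c2 = instantaneous_coverage(lam, delta)
    for t in ("1", "10", "100", "1000"):
        with_cov = aggregated_unreliability(lam, c1, c2, t)
        perfect = aggregated_unreliability(lam, 1, 1, t)
        print(t, -with_cov.log10(), -perfect.log10())
```

Setting $c_1 = c_2 = 1$ reduces the aggregated chain to the perfect-coverage model, whose unreliability is $(1 - e^{-\lambda t})^3$, so one function covers two of the three curves.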

Symbolic solutions of CTMC should provide at least two benefits. First, it should be
possible to compare the results obtained by exact and approximate solution methods
for small to medium sized CTMC. By indicating the magnitude of error that
approximate solutions introduce, this type of analysis should provide a good indication
of an aggregation/approximation technique's utility for larger, more realistic
problems. Second, symbolic solutions allow us to easily examine the influence of
changing parameter values on the solutions of Markov models. This type of
investigation could be very expensive using conventional simulation or numerical
solution techniques. By providing easy access to symbolic solutions of CTMC, the ACE
package should enhance our ability to study Markov reliability models, and
approximation techniques for their solution.


1. D. F. Towsley, J. C. Browne, and K. M. Chandy, "Models for Parallel Processing
Within Programs: Applications to CPU:I/O and I/O:I/O Overlap," CACM,
pp. 821-31, October 1978.

2. Steven A. Lavenberg, ed., Computer Performance Modeling Handbook,
Academic Press, 1983.

3. Kishor S. Trivedi, Probability & Statistics with Reliability, Queuing & Computer
Science Applications, Prentice-Hall, 1982.

4. Ying-Wah Ng and Algirdas Avizienis, "A Unified Reliability Model for Fault-
Tolerant Computers," IEEE Transactions on Computers, pp. 1002-1011,
November 1980.

5. Robert Geist and Kishor Trivedi, "Ultra-High Reliability Prediction for
Fault-Tolerant Computer Systems," IEEE Transactions on Computers, pp.
1118-1127, December 1983.

6. Daniel P. Siewiorek and Robert S. Swarz, The Theory and Practice of Reliable
System Design, Digital Press, 1982.

7. John F. Meyer, "On Evaluating the Performability of Degradable Computer
Systems," IEEE Transactions on Computers, August 1980.

8. Ambuj Goyal and Asser Tantawi, "Evaluation of Performability in Acyclic
Markov Chains," IBM Computer Science Research Report, May 1984.

9. D. R. Cox, "The Use of Complex Probability in the Theory of Stochastic
Processes," in Proceedings Cambridge Philosophical Society, 1955.

10. Marcel F. Neuts, Matrix Geometric Solutions in Stochastic Models: An
Algorithmic Approach, Johns Hopkins University Press, 1981.

11. A. Costes, J. E. Doucet, C. Landrault, and J. C. Laprie, "SURF: A Program for
Dependability Evaluation of Complex Fault-Tolerant Computing Systems," in
Proceedings IEEE 11-th Fault Tolerant Computing Symposium, pp. 72-78,
June 1981.

12. Sheldon M. Ross, Stochastic Processes, J. Wiley & Sons, 1983.

13. J. J. Stiffler, L. A. Bryant, and L. Guccione, "CARE III Final Report Phase 1,"
NASA Contractor Report 159122, November 1979.

14. Kishor Trivedi and Robert Geist, "A Tutorial on the CARE III Approach to
Reliability Modeling," NASA Contractor Report 3488, 1981.

15. Robin Sahner and Kishor Trivedi, "SPADE: Series-Parallel Directed Acyclic
Graph Evaluator," Technical Report CS-1984-15, Department of Computer
Science, Duke University, 1984. Submitted for publication.

16. Kishor Trivedi, Robert Geist, Mark Smotherman, and Joanne Bechta Dugan,
"Hybrid Modeling of Fault-Tolerant Computer Systems," International Journal
of Computers and Electrical Engineering, 1984. To appear, special
issue on "Reliability and Verification of Computing Systems."

17. R. B. Conn, P. M. Berryman, and K. L. Whitelaw, "CAST - A Complimentary
Analytic-Simulative Technique for Modeling Fault-Tolerant Computing
Systems," in Proceedings AIAA Computers in Aerospace Conference, Los
Angeles, November 1977.

18. S. V. Makam and A. Avizienis, "ARIES 81: A reliability and life-cycle evaluation
tool for fault-tolerant systems," in Proceedings IEEE 12-th Fault
Tolerant Computing Symposium, pp. 267-274, June 1982.


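GIVEN: An acyclic Markov chain with state space $\{1, 2, \ldots, N\}$ sorted so that the ancestors
of every state precede it in the list (for convenience).

BEGIN

FOR each state $i \in \{1, 2, \ldots, N\}$

    Determine $J(i)$, the set of states with transitions leading to $i$
    Determine $S(J(i))$ and $\{L(\gamma),\ \gamma \in S(J(i))\}$ using (9) and (15)

    Compute $a_{\gamma k}$, $k = 0, \ldots, L(\gamma)$, $\gamma \in S(J(i))$

    Let $\gamma^* = -q_i$ and $a'_{\gamma^* 0} = 0$

    Note: $\gamma^*$ for an absorbing state is 0.

    FOR $\gamma \in S(J(i))$, $\gamma \neq \gamma^*$

        FOR $k = 0, 1, \ldots, L(\gamma)$

            Compute $a'_{\gamma k}$ using formula (17)

            Accumulate $a'_{\gamma^* 0} = a'_{\gamma^* 0} + (-1)^{k+1} \cdot k!\, a_{\gamma k} / (\gamma - \gamma^*)^{k+1}$

        END FOR

    END FOR

    IF $\gamma^* \in S(J(i))$, compute $a'_{\gamma^*, k+1}$ from (19) for $k = 0, \ldots, L(\gamma^*)$

    $P_i(t) = \sum_{\gamma \in S(i)} e^{\gamma t} \sum_{k} a'_{\gamma k} t^k$

END FOR

END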


Method: Simulation
    Package: CAST17
        Note: Allows full simulation of a restricted class of systems

Method: Differential Equations (Numerical Solution)
    Package: HARP5
        Domain: Homogeneous and Non-Homogeneous acyclic CTMC
    Package: SAVE*
        Domain: Cyclic CTMC

Method: Differential Equations (Laplace Solution)
    Package: SURF11
        Domain: Non-Markov Processes
        Note: Approximate solution using Coxian method of stages

Method: Integral Equations (Numerical Solution)
    Package: CARE III Coverage Model13
        Domain: Semi-Markov Processes
    Package: CARE III Reliability Model13
        Domain: Non-Homogeneous CTMC

Method: Closed Form Solution
    Package: ARIES18
        Domain: Cyclic Homogeneous CTMC
        Note: Poles and their coefficients derived numerically
    Package: ACE
        Domain: Acyclic Homogeneous CTMC
        Note: Poles and their coefficients derived symbolically


Figure  1:  Reliability  Modeling  Packages  Employing  Markov  Chain  Techniques 
Figure 5: Aggregation's Effect on Reliability Estimates
The bottom curve is an estimate of reliability derived from the original Markov chain.
The middle curve is derived from the aggregated chain in Figure 4.
The top curve is a perfect coverage estimate.